class: center, top, .title-slide, title-slide

# Biostats Lecture 5: Statistical Hypothesis Testing
## Public Health 783
### Ralph Trane
### University of Wisconsin–Madison
### Fall 2019

---

# Recap

Random variables

Distributions

Estimators

Estimators are random variables! (for example, the average is a random variable)

---
layout: true

# Statistical Hypothesis Testing

---

**Scenario**: We've been playing a simple game. Every time you roll a six, I pay you a dollar. Every time I roll a six, you pay me a dollar. I've had crazy good luck, and by the end of the day I've won a lot of money from you.

--

You accuse me of cheating, and demand to test the dice I've been using! I agree to let you test them, but ONLY if you do it in a sound, statistical manner. How to go about that?

--

You decide to roll each of the three dice `\(12\)` times, for a total of `\(36\)` rolls. You assume they all behave the same, so the probability of rolling a six is the same for all three dice.

---

**Setup**: Let `\(X_1, X_2, ..., X_{36}\)` be the outcomes of the thirty-six "trials". Each trial consists of rolling a die and checking whether it's a six. If it's a six, we'll call it a success; if not, we'll call it a failure. I.e. `\(X_i \sim\)`

--

`\(\text{Bernoulli}(p)\)`.

*IF* the dice are fair, `\(P(X_i = 1) = 1/6\)` for all `\(i = 1,2,...,36\)`.

*IF* the dice are fair, we would expect to roll a `\(6\)` close to `\(\frac{1}{6}\cdot 36 = 6\)` times, i.e. about `\(6\)` of the `\(X\)`'s should be `\(1\)`'s.

What would cause you to reject the idea that the dice are fair?

--

If we see way more than `\(6\)` sixes. What would be "way more"? `\(7\)`? `\(8\)`? `\(17\)`?

---

In terms of probabilities: what is the *probability* of observing more than `\(10\)` sixes *IF* `\(P(X_i = 1) = \frac{1}{6}\)`?

* if the probability is small, more than `\(10\)` is a lot of sixes
* if the probability is large, more than `\(10\)` is a reasonable number of sixes

First, introduce the random variable `\(Y =\)` number of sixes `\(= X_1 + X_2 + ... + X_{36}\)`.

The probability of observing more than `\(10\)` sixes is `\(P(Y > 10)\)`.
To find this, we need the distribution of `\(Y\)`, which is

--

`\(\text{Binomial}(36, p)\)`, where `\(p\)` is the probability of rolling a six.

---

*IF* the dice are fair, `\(p = \frac{1}{6}\)`. So *IF* the dice are fair, the distribution of `\(Y\)` looks like this:

<center>
</center>

---

The probability we want to find is the red area below. We will do this in SAS in just a second. The result is 0.02849.

<center>
</center>

---

This means that *IF* the true probability of rolling a six with these dice is indeed `\(\frac{1}{6}\)`, the probability of rolling more than `\(10\)` sixes (that is, `\(11\)` or more) is `\(0.02849\)`.

Is this enough to convince you that the true probability is *NOT* `\(\frac{1}{6}\)`?

---

<br>
<br>
<br>
<br>

.center[
**In SAS.**
]

---
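**Cross-check (not SAS).** The tail probability can also be checked by summing the `\(\text{Binomial}(36, \frac{1}{6})\)` probabilities directly. Below is a minimal sketch in Python, using only the standard library. The helper names `binomial_pmf` and `upper_tail` are made up for this illustration and are not part of the course material.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(Y = k) for Y ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def upper_tail(k_min: int, n: int, p: float) -> float:
    """P(Y >= k_min): add up the pmf from k_min through n."""
    return sum(binomial_pmf(k, n, p) for k in range(k_min, n + 1))


n, p = 36, 1 / 6  # 36 rolls; probability of a six IF the dice are fair

p_more_than_10 = upper_tail(11, n, p)  # P(Y > 10) = P(Y >= 11)
p_at_least_10 = upper_tail(10, n, p)   # P(Y >= 10), shown for comparison

print(f"P(Y > 10)  = {p_more_than_10:.5f}")
print(f"P(Y >= 10) = {p_at_least_10:.5f}")
```

The first value rounds to `\(0.02849\)`, the number quoted on the earlier slide. The second is noticeably larger (about `\(0.065\)`), so it matters whether exactly `\(10\)` sixes is counted as part of the tail.

---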